AITopics | intermediate feature map

Semantic Image Synthesis with Unconditional Generator

Neural Information Processing Systems, Dec-25-2025, 20:03:13 GMT

Semantic image synthesis (SIS) aims to generate realistic images according to semantic masks given by a user. Although recent methods produce high-quality results with fine spatial control, SIS requires expensive pixel-level annotation of the training images. On the other hand, manipulating intermediate feature maps in a pretrained unconditional generator such as StyleGAN supports coarse spatial control without heavy annotation. In this paper, we introduce a new approach for reflecting a user's detailed guiding masks in a pretrained unconditional generator. Our method converts a user's guiding mask to a proxy mask through a semantic mapper.

name change, semantic image synthesis, unconditional generator, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (0.67)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

RISC-V Based TinyML Accelerator for Depthwise Separable Convolutions in Edge AI

Yildirim, Muhammed, Ozturk, Ozcan

arXiv.org Artificial Intelligence, Nov-27-2025

Abstract: The increasing demand for on-device intelligence in Edge AI and TinyML applications requires the efficient execution of modern Convolutional Neural Networks (CNNs). While lightweight architectures like MobileNetV2 employ Depth-wise Separable Convolutions (DSC) to reduce computational complexity, their multi-stage design introduces a critical performance bottleneck inherent to layer-by-layer execution: the high energy and latency cost of transferring intermediate feature maps to either large on-chip buffers or off-chip DRAM. To address this memory wall, this paper introduces a novel hardware accelerator architecture that utilizes a fused pixel-wise dataflow. Implemented as a Custom Function Unit (CFU) for a RISC-V processor, our architecture eliminates the need for intermediate buffers entirely, reducing data movement by up to 87% compared to conventional layer-by-layer execution. It computes a single output pixel to completion across all DSC stages (expansion, depthwise convolution, and projection) by streaming data through a tightly coupled pipeline without writing to memory. Evaluated on a Xilinx Artix-7 FPGA, our design achieves a speedup of up to 59.3x over the baseline software execution on the RISC-V core. Furthermore, ASIC synthesis projects a compact 0.284 mm [...] This work confirms the feasibility of a zero-buffer dataflow within a TinyML resource envelope, offering a novel and effective strategy for overcoming the memory wall in edge AI accelerators.

Edge AI[1] involves running artificial intelligence algorithms directly on local hardware, such as sensors and Internet of Things (IoT) units, bringing computation to the source of data creation. This allows for real-time processing without constant reliance on the cloud, an approach that offers several key benefits: low latency due to local processing, enhanced privacy by keeping sensitive data on the device, and reduced network bandwidth consumption, which enables reliable offline operation.[2] A critical subfield of this domain is Tiny Machine Learning (TinyML)[3], which specifically focuses on deploying machine learning models directly onto low-cost, ultra-low-power microcontrollers (MCUs) and embedded systems. These devices operate under severe constraints, often with power budgets in the milliwatt range and with only a few hundred kilobytes of memory, making on-device intelligence a significant technical challenge. The typical TinyML workflow involves taking a fully trained model and optimizing it for on-device inference by applying techniques such as quantization and pruning to create a smaller, more efficient model in a compact format.
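The fused pixel-wise idea in this abstract can be illustrated in software. The sketch below is not the authors' hardware design: it assumes a MobileNetV2-style block (1x1 expansion with ReLU6, 3x3 depthwise convolution with zero padding and ReLU6, linear 1x1 projection), and all function names, tensor sizes, and weights are hypothetical. It only checks that computing each output pixel to completion gives the same values as materializing every intermediate feature map. For simplicity it recomputes the expansion for each neighbouring pixel, which a real pipeline would probably avoid.

```python
import random

def relu6(v):
    """Clamp an activation to [0, 6] (assumed MobileNetV2-style nonlinearity)."""
    return min(max(v, 0.0), 6.0)

def expand_pixel(x, w_exp, r, c):
    """1x1 expansion at one pixel: x is [C_in][H][W], w_exp is [M][C_in]."""
    return [relu6(sum(w * x[k][r][c] for k, w in enumerate(row))) for row in w_exp]

def layer_by_layer(x, w_exp, w_dw, w_proj):
    """Reference path: every stage writes a full intermediate feature map."""
    H, W, M = len(x[0]), len(x[0][0]), len(w_exp)
    expanded = [[[0.0] * W for _ in range(H)] for _ in range(M)]
    for r in range(H):
        for c in range(W):
            for m, v in enumerate(expand_pixel(x, w_exp, r, c)):
                expanded[m][r][c] = v
    depthwise = [[[0.0] * W for _ in range(H)] for _ in range(M)]
    for m in range(M):
        for r in range(H):
            for c in range(W):
                acc = 0.0
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < H and 0 <= cc < W:  # zero padding
                            acc += w_dw[m][dr + 1][dc + 1] * expanded[m][rr][cc]
                depthwise[m][r][c] = relu6(acc)
    return [[[sum(w * depthwise[m][r][c] for m, w in enumerate(row))
              for c in range(W)] for r in range(H)] for row in w_proj]

def fused_pixel(x, w_exp, w_dw, w_proj, r, c):
    """Fused path: one output pixel through all three stages, no stored maps."""
    H, W, M = len(x[0]), len(x[0][0]), len(w_exp)
    acc = [0.0] * M
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < H and 0 <= cc < W:
                px = expand_pixel(x, w_exp, rr, cc)  # recomputed on the fly
                for m in range(M):
                    acc[m] += w_dw[m][dr + 1][dc + 1] * px[m]
    dw = [relu6(a) for a in acc]
    return [sum(w * dw[m] for m, w in enumerate(row)) for row in w_proj]

# Hypothetical toy sizes and random weights, chosen only for the comparison.
rng = random.Random(0)
C_IN, M, C_OUT, H, W = 2, 4, 2, 5, 5
x = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C_IN)]
w_exp = [[rng.uniform(-1, 1) for _ in range(C_IN)] for _ in range(M)]
w_dw = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(M)]
w_proj = [[rng.uniform(-1, 1) for _ in range(M)] for _ in range(C_OUT)]

reference = layer_by_layer(x, w_exp, w_dw, w_proj)
max_diff = max(abs(reference[o][r][c] - fused_pixel(x, w_exp, w_dw, w_proj, r, c)[o])
               for o in range(C_OUT) for r in range(H) for c in range(W))
```

Here `max_diff` measures the largest disagreement between the two paths. The fused path never allocates the `expanded` or `depthwise` maps, which is the property the abstract attributes to its zero-buffer dataflow.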

accelerator, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.21232

Country: Asia > Middle East > Republic of Türkiye (0.28)

Genre: Research Report (0.64)

Industry: Semiconductors & Electronics (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

81f554467f27759e88de14ba2fbafb47-Paper-Conference.pdf

Neural Information Processing Systems, Oct-10-2025, 07:43:39 GMT

dataset, experiment, information, (17 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (0.93)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

P-TAME: Explain Any Image Classifier with Trained Perturbations

Ntrougkas, Mariano V., Mezaris, Vasileios, Patras, Ioannis

arXiv.org Artificial Intelligence, Jan-29-2025

The adoption of Deep Neural Networks (DNNs) in critical fields where predictions need to be accompanied by justifications is hindered by their inherent black-box nature. In this paper, we introduce P-TAME (Perturbation-based Trainable Attention Mechanism for Explanations), a model-agnostic method for explaining DNN-based image classifiers. P-TAME employs an auxiliary image classifier to extract features from the input image, bypassing the need to tailor the explanation method to the internal architecture of the backbone classifier being explained. Unlike traditional perturbation-based methods, which have high computational requirements, P-TAME offers an efficient alternative by generating high-resolution explanations in a single forward pass during inference. We apply P-TAME to explain the decisions of VGG-16, ResNet-50, and ViT-B-16, three distinct and widely used image classifiers. Quantitative and qualitative results show that our method matches or outperforms previous explainability methods, including model-specific approaches. Code and trained models will be released upon acceptance.

artificial intelligence, classifier, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.17813

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Rethinking Encoder-Decoder Flow Through Shared Structures

Laboyrie, Frederik, Yucel, Mehmet Kerim, Saa-Garriga, Albert

arXiv.org Artificial Intelligence, Jan-24-2025

Dense prediction tasks have enjoyed a growing complexity of encoder architectures; decoders, however, have remained largely the same. They rely on individual blocks decoding intermediate feature maps sequentially. We introduce banks, shared structures that are used by each decoding block to provide additional context in the decoding process. Applied via resampling and feature fusion, these structures improve performance on depth estimation for state-of-the-art transformer-based architectures on natural and synthetic images whilst training on large-scale datasets.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.14535

Country: Europe > United Kingdom > England > Greater London > London (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.37)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

Semantic Image Synthesis with Unconditional Generator

Neural Information Processing Systems, Jan-18-2025, 23:57:46 GMT

intermediate feature map, semantic image synthesis, unconditional generator, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Debias the Training of Diffusion Models

Yu, Hu, Shen, Li, Huang, Jie, Zhou, Man, Li, Hongsheng, Zhao, Feng

arXiv.org Artificial Intelligence, Nov-3-2023

Diffusion models have demonstrated compelling generation quality by optimizing the variational lower bound through a simple denoising score matching loss. In this paper, we provide theoretical evidence that the prevailing practice of using a constant loss weight strategy in diffusion models leads to biased estimation during the training phase. Simply optimizing the denoising network to predict Gaussian noise with constant weighting may hinder precise estimations of original images. To address the issue, we propose an elegant and effective weighting strategy grounded in the theoretically unbiased principle. Moreover, we conduct a comprehensive and systematic exploration to dissect the inherent bias problem deriving from constant weighting loss from the perspectives of its existence, impact and reasons. These analyses are expected to advance our understanding and demystify the inner workings of diffusion models. Through empirical evaluation, we demonstrate that our proposed debiased estimation method significantly enhances sample quality without relying on complex techniques, and exhibits improved efficiency compared to the baseline method in both training and sampling.

Diffusion models (Sohl-Dickstein et al., 2015; Ho et al., 2020) have emerged as powerful generative models that have garnered significant attention recently. Their popularity stems from the remarkable ability to generate diverse and high-quality samples (Dhariwal & Nichol, 2021; Rombach et al., 2022; Ramesh et al., 2022; Nichol & Dhariwal, 2021) as well as from their training-stable loss form, compared to the adversarial training paradigms used in Generative Adversarial Networks (GANs) (Goodfellow et al., 2014).

different weighting strategy, diffusion model, weighting strategy, (15 more...)

arXiv.org Artificial Intelligence

2310.08442

Country: Asia > China > Hong Kong (0.04)

Genre: Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.34)

Privacy-Preserving Inference in Machine Learning Services Using Trusted Execution Environments

Narra, Krishna Giri, Lin, Zhifeng, Wang, Yongqin, Balasubramaniam, Keshav, Annavaram, Murali

arXiv.org Machine Learning, Dec-7-2019

This work presents Origami, which provides privacy-preserving inference for large deep neural network (DNN) models through a combination of enclave execution and cryptographic blinding, interspersed with accelerator-based computation. Origami partitions the ML model into multiple partitions. The first partition receives the encrypted user input within an SGX enclave. The enclave decrypts the input and then applies cryptographic blinding to the input data and the model parameters. Cryptographic blinding is a technique that adds noise to obfuscate data. Origami sends the obfuscated data for computation to an untrusted GPU/CPU. The blinding and de-blinding factors are kept private by the SGX enclave, thereby preventing any adversary from denoising the data when the computation is offloaded to a GPU/CPU. The computed output is returned to the enclave, which decodes the computation on noisy data using the unblinding factors privately stored within SGX. This process may be repeated for each DNN layer, as was done in the prior work Slalom. However, the overhead of blinding and unblinding the data is a limiting factor to scalability. Origami relies on the empirical observation that the feature maps after the first several layers cannot be used, even by a powerful conditional GAN adversary, to reconstruct the input. Hence, Origami dynamically switches to executing the rest of the DNN layers directly on an accelerator without needing any further cryptographic blinding intervention to preserve privacy. We empirically demonstrate that using Origami, a conditional GAN adversary, even with an unlimited inference budget, cannot reconstruct the input. We implement and demonstrate the performance gains of Origami using the VGG-16 and VGG-19 models. Compared to running the entire VGG-19 model within SGX, Origami improves the speedup of private inference from 11x with Slalom to 15.1x.
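The blind, offload, and unblind cycle described above can be sketched for a single linear layer. This is an illustrative assumption rather than Origami's actual scheme: it uses additive blinding over integers modulo a prime, and the modulus `P`, the helper names, and the toy weights are all hypothetical. The abstract also mentions blinding model parameters, which this sketch does not cover.

```python
import secrets

P = 2**31 - 1  # hypothetical prime modulus for the blinded arithmetic

def matvec(W, v):
    """Linear layer over integers mod P (stands in for the offloaded computation)."""
    return [sum(w * x for w, x in zip(row, v)) % P for row in W]

def blind(x):
    """Enclave side: add a fresh random mask r to the input."""
    r = [secrets.randbelow(P) for _ in x]
    return [(a + b) % P for a, b in zip(x, r)], r

def unblind(y_blinded, unblinding_factor):
    """Enclave side: remove the mask's contribution from the returned output."""
    return [(a - b) % P for a, b in zip(y_blinded, unblinding_factor)]

W = [[3, 1, 4], [1, 5, 9]]  # toy layer weights
x = [2, 7, 1]               # toy private input

x_blinded, r = blind(x)
unblinding_factor = matvec(W, r)   # kept private inside the enclave
y_blinded = matvec(W, x_blinded)   # computed by the untrusted accelerator
y = unblind(y_blinded, unblinding_factor)
```

Because the layer is linear, `W(x + r) - W r = W x`, so `y` equals the unblinded result `[17, 46]` while the accelerator only ever sees `x_blinded`. Repeating this for every layer is the overhead that the abstract says Origami avoids after the first several layers.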

enclave, feature map, inference, (16 more...)

arXiv.org Machine Learning

1912.03485

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.83)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.86)
Health & Medicine (0.68)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Enhancing Adversarial Example Transferability with an Intermediate Level Attack

Huang, Qian, Katsman, Isay, He, Horace, Gu, Zeqi, Belongie, Serge, Lim, Ser-Nam

arXiv.org Machine Learning, Jul-23-2019

Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples are typically overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box transfer attacks to other target models. We introduce the Intermediate Level Attack (ILA), which attempts to fine-tune an existing adversarial example for greater black-box transferability by increasing its perturbation on a pre-specified layer of the source model, improving upon state-of-the-art methods. We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability. Additionally, we provide some explanatory insights regarding our method and the effect of optimizing for adversarial examples in intermediate feature maps.
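One way to read "increasing its perturbation on a pre-specified layer" is as maximizing the projection of the new feature-space perturbation onto the direction given by the existing adversarial example. The sketch below makes that assumption explicit. It replaces the source model's layer with a toy linear map `A`, so the gradient is available in closed form, and the function names, step size, and L-infinity budget `eps` are hypothetical. The abstract itself does not state the objective.

```python
def matvec(A, v):
    """Toy 'intermediate layer': a linear feature map F(x) = A x."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sign(v):
    return (v > 0) - (v < 0)

def projection(A, x, x_prime, x_adv):
    """Projection of F(x') - F(x) onto the reference direction F(x_adv) - F(x)."""
    f_x = matvec(A, x)
    d_ref = [a - b for a, b in zip(matvec(A, x_adv), f_x)]
    d_new = [a - b for a, b in zip(matvec(A, x_prime), f_x)]
    return dot(d_new, d_ref)

def ila_finetune(A, x, x_adv, eps, step, iters):
    """Sign-gradient ascent on the projection, kept inside an L-inf ball around x."""
    f_x = matvec(A, x)
    d_ref = [a - b for a, b in zip(matvec(A, x_adv), f_x)]
    # For a linear layer, the gradient of the projection w.r.t. x' is A^T d_ref.
    grad = [sum(A[i][j] * d_ref[i] for i in range(len(A))) for j in range(len(x))]
    x_new = list(x_adv)  # start from the existing adversarial example
    for _ in range(iters):
        x_new = [xn + step * sign(g) for xn, g in zip(x_new, grad)]
        x_new = [min(max(xn, xi - eps), xi + eps) for xn, xi in zip(x_new, x)]
    return x_new

A = [[1.0, -2.0, 0.5], [0.0, 1.0, 1.0]]
x = [0.2, 0.4, 0.6]
x_adv = [0.25, 0.38, 0.61]  # an existing adversarial example within eps of x

x_ft = ila_finetune(A, x, x_adv, eps=0.1, step=0.02, iters=10)
before = projection(A, x, x_adv, x_adv)
after = projection(A, x, x_ft, x_adv)
```

In this toy setting `after` exceeds `before`, meaning the fine-tuned example perturbs the chosen layer more strongly along the original adversarial direction while staying within the same input-space budget. With a real network the gradient would come from automatic differentiation rather than a closed form.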

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

1907.10823

Genre: Research Report > Promising Solution (0.48)

Industry:

Transportation (0.76)
Information Technology > Security & Privacy (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Deep multiscale convolutional feature learning for weakly supervised localization of chest pathologies in X-ray images

Sedai, Suman, Mahapatra, Dwarikanath, Ge, Zongyuan, Chakravorty, Rajib, Garnavi, Rahil

arXiv.org Machine Learning, Aug-22-2018

Localization of chest pathologies in chest X-ray images is a challenging task because of their varying sizes and appearances. We propose a novel weakly supervised method to localize chest pathologies using class-aware deep multiscale feature learning. Our method leverages intermediate feature maps from CNN layers at different stages of a deep network during the training of a classification model using image-level annotations of pathologies. During the training phase, a set of layer relevance weights is learned for each pathology class, and the CNN is optimized to perform pathology classification by a convex combination of feature maps from both shallow and deep layers using the learned weights. During the test phase, to localize the predicted pathology, the multiscale attention map is obtained by a convex combination of class activation maps from each stage using the layer relevance weights learned during the training phase. We have validated our method using 112,000 X-ray images and compared it with state-of-the-art localization methods. We experimentally demonstrate that the proposed weakly supervised method can improve the localization performance for small pathologies such as nodule and mass while giving comparable performance for bigger pathologies such as cardiomegaly.
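The test-time step, a convex combination of per-stage class activation maps, can be sketched as follows. Two assumptions are made here that the abstract does not state: the layer relevance weights are obtained from unconstrained logits via a softmax (one common way to guarantee non-negative weights that sum to one), and the per-stage maps have already been resized to a common resolution. All names and the toy maps are hypothetical.

```python
import math

def softmax(logits):
    """Turn unconstrained logits into non-negative weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combine_maps(maps, relevance_logits):
    """Convex combination of same-sized 2D maps, one map per network stage."""
    weights = softmax(relevance_logits)
    H, W = len(maps[0]), len(maps[0][0])
    combined = [[sum(w * m[r][c] for w, m in zip(weights, maps))
                 for c in range(W)] for r in range(H)]
    return combined, weights

# Toy class activation maps for one pathology class from two stages.
shallow_cam = [[0.0, 1.0], [1.0, 0.0]]
deep_cam = [[1.0, 1.0], [0.0, 0.0]]

attention, weights = combine_maps([shallow_cam, deep_cam], [0.0, 0.0])
```

With equal logits the weights are `[0.5, 0.5]` and `attention` is the pixel-wise average `[[0.5, 1.0], [0.5, 0.0]]`. Raising one stage's logit shifts the combined map toward that stage, which is how class-specific weights could favour shallow layers for small pathologies.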

artificial intelligence, feature map, machine learning, (17 more...)

arXiv.org Machine Learning

1808.0828

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Montana (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
